Scalable Task-Oriented Parallelism for Structure Based Incomplete LU Factorization
Abstract
ILU(k) is an important preconditioner widely used in many linear algebra solvers for sparse matrices. Unfortunately, there is still no highly scalable parallel ILU(k) algorithm. This paper presents the first such scalable algorithm. For example, the new algorithm achieves a 50-fold speedup on 80 nodes for general, diagonally dominant sparse matrices of dimension 160,000. The algorithm assumes that each node has sufficient memory to hold the matrix. The parallelism is task-oriented. We present experimental results for k = 1 and k = 2, which are the most commonly used cases in practical applications. The results are presented for three platforms: a departmental cluster with Gigabit Ethernet; a high-performance cluster using an InfiniBand interconnect; and a simulation of a Grid computation with two or three participating sites.
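For readers unfamiliar with the "structure based" part of ILU(k), the following is a minimal sketch of the textbook sequential level-of-fill rule that decides which fill-in entries are kept. It is not the parallel algorithm of the paper, which the abstract does not describe. The function name `ilu_k_pattern`, the row-of-sets input format, and the parameter name `max_level` (standing for k) are assumptions made for illustration.

```python
def ilu_k_pattern(rows, max_level):
    """Symbolic ILU(k): compute the kept sparsity pattern and fill levels.

    rows      -- list of sets; rows[i] holds the column indices of the
                 nonzeros in row i of a square matrix (hypothetical format)
    max_level -- the k in ILU(k); entries with a level above it are dropped

    Returns a list of dicts mapping column index -> level for each row.
    Original nonzeros have level 0. A fill entry (i, j) created through
    pivot p gets level(i, p) + level(p, j) + 1, keeping the minimum.
    """
    levels = []
    for i, cols in enumerate(rows):
        row = {j: 0 for j in cols}
        # Visit pivot columns in increasing order. Fill created by pivot p
        # only lands in columns > p, so later pivots see it automatically.
        for p in range(i):
            if p not in row:
                continue
            lev_ip = row[p]
            for j, lev_pj in levels[p].items():
                if j <= p:
                    continue  # only the upper part of row p contributes
                new_level = lev_ip + lev_pj + 1
                if new_level <= max_level and new_level < row.get(j, max_level + 1):
                    row[j] = new_level
        levels.append(row)
    return levels


# Small 4x4 pattern: entry (1, 3) is level-1 fill, entry (2, 3) is level-2 fill.
example_rows = [{0, 3}, {0, 1}, {1, 2}, {2, 3}]
pattern_k1 = ilu_k_pattern(example_rows, 1)
pattern_k2 = ilu_k_pattern(example_rows, 2)
```

With `max_level = 0` the pattern equals that of the original matrix; raising it admits progressively more fill, which is why k = 1 and k = 2 are a common compromise between preconditioner quality and cost.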
Related resources
ILUM: A Multi-Elimination ILU Preconditioner for General Sparse Matrices
Standard preconditioning techniques based on incomplete LU (ILU) factorizations offer a limited degree of parallelism, in general. A few of the alternatives advocated so far consist of either using some form of polynomial preconditioning, or applying the usual ILU factorization to a matrix obtained from a multicolor ordering. In this paper we present an incomplete factorization technique based ...
Computing a block incomplete LU preconditioner as the by-product of block left-looking A-biconjugation process
In this paper, we present a block version of incomplete LU preconditioner which is computed as the by-product of block A-biconjugation process. The pivot entries of this block preconditioner are one by one or two by two blocks. The L and U factors of this block preconditioner are computed separately. The block pivot selection of this preconditioner is inherited from one of the block versions of...
Multi-objective and Scalable Heuristic Algorithm for Workflow Task Scheduling in Utility Grids
To use services transparently in a distributed environment, the Utility Grids develop a cyber-infrastructure. The parameters of the Quality of Service such as the allocation-cost and makespan have to be dealt with in order to schedule workflow application tasks in the Utility Grids. Optimization of both target parameters above is a challenge in a distributed environment and may conflict one an...
A Comparison of 1D and 2D Data Mapping for Sparse LU Factorization with Partial Pivoting
This paper presents a comparative study of two data mapping schemes for parallel sparse LU factorization with partial pivoting on distributed memory machines. Our previous work has developed an approach that incorporates static symbolic factorization, nonsymmetric L/U supernode partitioning, and graph scheduling for this problem with 1D column block mapping. The 2D mapping is commonly considered mor...
Enhanced Parallel Multicolor Preconditioning Techniques for Linear Systems
When solving a linear system in parallel, a large overhead in using an incomplete LU factorization as a preconditioner may annihilate any gains made from the improved convergence. This overhead is due to the inherently sequential nature of such a preconditioning. Multicoloring of the subdomains assigned to processors is a common remedy for increasing the parallelism of a global ordering. Howeve...
Journal:
- CoRR
Volume: abs/0803.0048
Publication date: 2008